Categorical Variable Mapping Considerations in Classification Problems: Protein Application

Authors

Abstract

The mapping of categorical variables into numerical values is common in machine learning classification problems. This type of mapping is frequently performed in a relatively arbitrary manner. We present a series of four assumptions (tested numerically) regarding these mappings in the context of protein classification using amino acid information. These assumptions involve mapping categorical variables in protein classification problems without the need for approaches such as natural language processing (NLP). The first three assumptions relate to equivalent mappings, and the fourth involves a comparable mapping using a proposed eigenvalue-based matrix representation of the amino acid chain. These assumptions were tested across a range of 23 different machine learning algorithms. It is shown that the numerical simulations are consistent with the presented assumptions, such as translation and permutations, and that the eigenvalue approach generates classifications that are statistically not different from the base case or that have higher mean values, while at the same time providing some advantages, such as having fixed predetermined dimensions regardless of the size of the analyzed protein. This approach generated an accuracy of 83.25%. An optimization algorithm is also presented that selects the appropriate number of neurons in an artificial neural network applied to the above-mentioned protein classification problem, achieving an accuracy of 85.02%. The model includes a quadratic penalty function to decrease the chances of overfitting.
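The kinds of mappings discussed in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the abstract does not give the base mapping, the specific translations or permutations, or the construction of the eigenvalue-based matrix. In particular, `eigenvalue_features` uses a hypothetical symmetrized adjacent-pair count matrix, chosen only because it yields a fixed number of features (20) for a chain of any length; it should not be read as the paper's representation.

```python
import numpy as np

# The 20 standard amino acids in one-letter code (ordering is an arbitrary choice).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def base_mapping():
    """Hypothetical base case: map each amino acid to an integer 1..20."""
    return {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def translated_mapping(shift):
    """Translation of the base mapping: every numerical value is shifted by a constant."""
    return {aa: value + shift for aa, value in base_mapping().items()}


def permuted_mapping(seed):
    """Permutation of the base mapping: the same values 1..20, reassigned at random."""
    rng = np.random.default_rng(seed)
    values = rng.permutation(np.arange(1, len(AMINO_ACIDS) + 1))
    return {aa: int(v) for aa, v in zip(AMINO_ACIDS, values)}


def encode(sequence, mapping):
    """Encode an amino acid chain as a numeric vector (length varies with the chain)."""
    return np.array([mapping[aa] for aa in sequence], dtype=float)


def eigenvalue_features(sequence):
    """Illustrative fixed-length representation (an assumption, not the paper's matrix).

    Counts adjacent amino acid pairs in a 20x20 matrix, symmetrizes it, and
    returns its 20 real eigenvalues in ascending order.
    """
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    counts = np.zeros((len(AMINO_ACIDS), len(AMINO_ACIDS)))
    for first, second in zip(sequence, sequence[1:]):
        counts[index[first], index[second]] += 1
    symmetric = counts + counts.T
    return np.linalg.eigvalsh(symmetric)


if __name__ == "__main__":
    chain = "ACDKLMACD"
    print(encode(chain, base_mapping()))
    print(encode(chain, translated_mapping(10)))
    print(encode(chain, permuted_mapping(seed=0)))
    print(eigenvalue_features(chain).shape)
```

The direct encodings produce vectors whose length depends on the chain, whereas the eigenvalue features always have 20 entries, which is the advantage of fixed dimensions that the abstract mentions.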

Similar resources

Kernel methods for Multi-labelled classification and Categorical regression problems

This report presents an SVM-like learning system to handle multi-label problems. Such problems arise naturally in bio-informatics. Consider, for instance, the MIPS Yeast genome database in [12], which is formed by around 3,300 genes associated with their functional classes. One gene can have many classes, and different genes do not belong to the same number of functional categories. Such a problem can ...

Interpolation in variable Hilbert scales with application to inverse problems

For solving linear ill-posed problems with noisy data, regularization methods are required. In the present paper, regularized approximations in Hilbert scales are obtained by a general regularization scheme. The analysis of such schemes is based on new results for interpolation in Hilbert scales. Error bounds are obtained under general smoothness conditions.

Homotopy Classification of Categorical Torsors

The long-known results of Schreier on group extensions are here raised to a categorical level by giving a factor set theory for torsors under a categorical group (G,⊗) over a small category B. We show a natural bijection between the set of equivalence classes of such torsors and [B(B),B(G,⊗)], the set of homotopy classes of continuous maps between the corresponding classifying spaces. These res...

Statistics for Categorical Surveys—A New Strategy for Multivariate Classification and Determining Variable Importance

Surveys can be a rich source of information. However, the extraction of underlying variables from the analysis of mixed categoric and numeric survey data is fraught with complications when using grouping techniques such as clustering or ordination. Here I present a new strategy to deal with classification of households into clusters, and identification of cluster membership for new households. ...

Categorical Perception in Facial Emotion Classification

We present an automated emotion recognition system that is capable of identifying six basic emotions (happy, surprise, sad, angry, fear, disgust) in novel face images. An ensemble of simple feed-forward neural networks is used to rate each of the images. The outputs of these networks are then combined to generate a score for each emotion. The networks were trained on a database of face images ...

Journal

Journal title: Mathematics

Year: 2023

ISSN: 2227-7390

DOI: https://doi.org/10.3390/math11020279